Heterozygous genome assembly via binary classification of homologous sequence
Authors
Abstract
Similar resources
Ranking via Robust Binary Classification
We propose RoBiRank, a ranking algorithm that is motivated by observing a close connection between evaluation metrics for learning to rank and loss functions for robust classification. It shows competitive performance on standard benchmark datasets against a number of other representative algorithms in the literature. We also discuss extensions of RoBiRank to large scale problems where explicit...
Genome sequence and assembly of Bos indicus.
Cattle are divided into 2 groups referred to as taurine and indicine, both of which have been under strong artificial selection due to their importance for human nutrition. A side effect of this domestication includes a loss of genetic diversity within each specialized breed. Recently, the first taurine genome was sequenced and assembled, allowing for a better understanding of this ruminant spe...
Parallel Assembler for Fuzzy Genome Sequence Assembly
Assembly is an NP-Hard problem, which involves comparing fragments that have a time complexity of O(n). This paper presents a parallel approach for sequence assembly. The parallel technique is based on classification to group organisms by similarity rather than an embarrassingly parallel approach that requires duplication of the data across all nodes. This process of classification, based on DN...
Genome Sequence Assembly Using Trace Signals and Additional Sequence Information
Motivation: This article presents a method for assembling shotgun sequences which primarily uses high confidence regions whilst taking advantage of additional available information such as low confidence regions, quality values or repetitive region tags. Conflict situations are resolved with routines for analysing trace signals. Results: Initial tests with different human and mouse genome proje...
In vitro, long-range sequence information for de novo genome assembly via transposase contiguity.
We describe a method that exploits contiguity preserving transposase sequencing (CPT-seq) to facilitate the scaffolding of de novo genome assemblies. CPT-seq is an entirely in vitro means of generating libraries comprised of 9216 indexed pools, each of which contains thousands of sparsely sequenced long fragments ranging from 5 kilobases to > 1 megabase. These pools are "subhaploid," in that th...
Journal
Journal title: BMC Bioinformatics
Year: 2015
ISSN: 1471-2105
DOI: 10.1186/1471-2105-16-s7-s5